From:                              route@monster.com

Sent:                               Tuesday, June 04, 2013 3:54 PM

To:                                   hg@apeironinc.com

Subject:                          Please review this candidate for: Big Data

 

This resume has been forwarded to you at the request of Monster User xapeix01

Bin Yu 

Last updated:  08/13/12

Job Title:  not specified

Company:  not specified

Rating:  Not Rated

Screening score:  not specified

Status:  Resume Received


San Jose, CA  95148
US

Mobile: 408-623-2346   
Home:

binyumail@gmail.com
Contact Preference:  Email

 

 

RESUME

  

Resume Headline: CV

Resume Value: e6mz723wsnf2q9b8   

  

 

Bin Yu, Ph. D.

San Jose, CA 95148

(408) 623-2346

binyumail@gmail.com

http://divinev.com

 

OVERVIEW

·         More than 10 years of successful track record in the design and development of large-scale distributed systems for big data analytics and internet search. Strong hands-on experience in text analytics, machine learning, data mining, and large-scale distributed system architecture.

·         Demonstrated leadership in technical project management and organizational resource management, leading multiple onshore and offshore development teams and delivering high-quality products.

 

TECHNICAL EXPERTISE             

·         Deep knowledge of the Hadoop ecosystem, including HDFS, MapReduce, HBase, Sqoop, Flume, Pig, and Hive.

·         Architect-level hands-on programming skills in Java, C++, Perl, and SQL.

·         Many years of experience in big data mining, text analytics, machine learning, internet search, large-scale distributed system infrastructure, middleware, and tooling.

 

WORK EXPERIENCE

2011-Present: Rearden Commerce

Head of Big Data Analytics Engineering, CTO Organization

·         Assumed increasing responsibility for the architecture design and system development of the analytics platform and infrastructure supporting big data analytics and a relevance engine for business intelligence and e-commerce recommendation. Led the effort to build the analytics infrastructure from scratch with Hadoop and its ecosystem, including HBase, Mahout, Pig, and Hive, as well as AWS cloud technologies.

·         Leading projects to build and productize data mining and machine learning algorithms for e-commerce personalization in travel booking, deal offers, and B2B e-commerce.

·         Managing the research and development of sentiment analysis using natural language processing technology based on OpenNLP and WordNet.

·         Responsible for data collection from internal and external sources, including Oracle, MySQL, and popular social network and review sites, and for integration with the data warehouse.

·         Building the Analytics Engineering team; responsible for managing and maintaining the Hadoop cluster that supports MapReduce, Pig, and Hive, continuously adding functional modules and UDF libraries.

 

2006-2011: Ask.com (competitor of Google)

Head of Search Metrics and Infrastructure, Search Technology

·         Built and managed distributed onshore and offshore technology teams with increasing performance and responsibility. Led the team that built a proprietary, HBase-like NoSQL database based on Google’s Bigtable.

·         Assumed broad responsibility for the development of a new generation of internet search engine infrastructure, including middleware, the logging system, common services, and tools. Achieved a performance breakthrough in core components, leading to an end-to-end throughput of half a billion pages per day.

·         Maintained leadership in a variety of advanced technologies in the search engine business. Restructured and streamlined the product development process for classification, categorization, and ranking algorithms based on machine learning and data mining techniques, achieving continuous improvement in user experience.

·         Led large-scale data analytics projects for business intelligence, user behavior analysis, and monetization opportunity discovery. Achievements also include better advertising performance and improved SEM models.

 

2002-2006: Hologic

Senior Manager, Advanced Technology

·         Managed a group of scientists and software engineers in the advanced technology department. Strengthened the company’s competitiveness by maintaining its leading position in breast cancer detection.

·         Assumed responsibilities for product specification, prioritization, scheduling, algorithm project management and system architecture design.

·         Managed all aspects of the development of medical data mining, machine learning and neural network algorithms for structured and unstructured medical data analysis and disease diagnosis.

             

2002-2002: Rational Software (part of IBM)

Project Lead, Enterprise Software

·         Led the design and development of ClearQuest charting and reporting components for its new-generation SaaS solution.

 

1999-2002: Akamai Technologies

Tech Lead, Streaming Product

·         Responsible for the architecture design and system development of scalable distributed streaming services. Built the first internet webcasting service for corporate communication.

·         Led the design and development of the decentralized web publishing system, encoder automation system, and logging and monitoring system, with a high degree of scalability, fault tolerance, and reliability.

 

1997-1999: Electroglas

Staff Engineer, Computer Vision

·         Designed and developed machine learning algorithms for defect detection, one of the key components in successfully building the new generation of products.

 

EDUCATION

·         Postdoctoral Research Fellow, Computer Science, Michigan State University.

·         Ph.D., Electronic Engineering, Tsinghua University.

·         MS, Biomedical Engineering, Tianjin University.

 

PUBLICATIONS

·         CMEIAS: A computer-aided system for image analysis of bacterial morphotypes in microbial communities, Microbial Ecology.

·         Automatic text location in images and video frames, Pattern Recognition.

·         Document representation and its application to page decomposition, IEEE Transactions on Pattern Analysis and Machine Intelligence.

·         A generic system for form dropout, IEEE Transactions on Pattern Analysis and Machine Intelligence.

·         A robust and fast skew detection algorithm for generic documents, Pattern Recognition.

·         A consistent attribute graph-based hand drawn circuit diagram reading system, Chinese Journal of Electronics.

·         A global optimum clustering algorithm, Engineering Applications of Artificial Intelligence.

·         The image contour extraction of engineering drawings and its applications to recognizing hand writing characters, Journal of Northern Jiaotong University.

·         A more efficient branch and bound algorithm for feature selection, Pattern Recognition.

·         A dynamic selection algorithm for globally optimal subset, Engineering Applications of Artificial Intelligence.

·         BF** algorithm for feature selection and its comparison with BF* algorithm, Acta Electronica Sinica.

·         Isothetic polygon representation for contours, CVGIP: Image Understanding.

·         The tree representation of the graph used in binary image processing, Information Processing Letters.

·         Representation of LAG structure used in binary image processing with extended binary tree, Chinese Journal of Computers.

·         BAG-based vectorization and its application to recognizing hand-drawn logic circuit diagrams, Acta Electronica Sinica.

·         The image boundary tracing and its application to the recognition of hand-written characters, Acta Electronica Sinica.

·         NMR medical image analysis in high noise, Chinese Journal of Medical Instrumentation.

·         CMEIAS: Center for microbial ecology image analysis system, in Proceedings of the 8th International Symposium on Microbial Ecology, Halifax, Canada.

·         Automatic text location in images and video frames, in Proceedings of the 14th International Conference on Pattern Recognition, Brisbane.

·         Model-based document representation: application to page segmentation, in Proceedings of the 4th International Conference on Document Analysis and Recognition, Ulm.

·         Address block location on complex mail pieces, in Proceedings of the 4th International Conference on Document Analysis and Recognition, Ulm.

·         Lane boundary detection using a multiresolution Hough transform, in Proceedings of the IEEE International Conference on Image Processing, Santa Barbara.

·         A form dropout system, in Proceedings of the 13th International Conference on Pattern Recognition, Vol. 3, Vienna.

·         Document processing research in Michigan State University, in Proceedings of the Symposium on Document Image Understanding Technology, Maryland.

·         Automatic understanding of symbol connected diagrams, in Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal.

·         A feature selection method for multi-class-set classification, in Proceedings of the IEEE International Joint Conference on Neural Network, Vol. 3, Baltimore.

·         The extended binary tree representation of binary image and its application to engineering drawing entry, in Proceedings of the 10th IEEE International Conference on Pattern Recognition, Atlantic.

·         An economical contour extraction algorithm for understanding large-size engineering drawings, in Proceedings of the 1st IEEE International Conference on Systems Integration, Morristown.

·         A BAG-based vectorizer for automatic diagram reader, in Proceedings of the International Conference on CAD & CG, Beijing.

·         The data structure used in image processing and its application to OCR, in Proceedings of the 4th National Conference on Image Science.

·         Nuclear magnetic resonance medical imaging with low field, in Proceedings of the 9th IEEE Annual Conference of the Engineering in Medicine and Biology Society, Boston.

 

KEYWORDS

·         Hadoop, HBase, MapReduce, Pig, Hive, Flume, Sqoop, Mahout, Nutch, Lucene, Solr, MySQL, Data Warehouse, WordNet, OpenNLP

·         Classification, Large Scale Distributed Systems, Data Mining, Data Analytics, Machine Learning, Information Retrieval, Natural Language Processing

·         Big data analytics, Internet Search, Cloud Computing, Amazon AWS EC2

·         Java, C++, Perl, REST, Servlet, JSP, XML, JSON, Web Services, J2EE, SaaS, Tomcat and Application Server

·         Linux, TCP/IP, HTTP, Eclipse, Scrum, Agile

 



Additional Info

 

Current Career Level:

Manager (Manager/Supervisor of Staff)

Years of relevant work experience:

More than 15 Years

Date of Availability:

Immediately

Work Status:

US - I am authorized to work in this country for any employer.

Active Security Clearance:

None

US Military Service:

Citizenship:

US citizen

 

 

Target Job:

Target Job Title:

Leadership in Big Data

 

Target Company:

Company Size:

 

Target Locations:

Selected Locations:

US-CA

Relocate:

No

Willingness to travel:

Up to 25% travel